Class Noise Mitigation Through Instance Weighting
Authors
Abstract
We describe a novel framework for class noise mitigation that assigns a vector of class membership probabilities to each training instance, and uses the confidence in the current label as a weight during training. The probability vector should be calculated such that clean instances have high confidence in their current labels, while mislabeled instances have low confidence in their current labels and high confidence in their correct labels. Past research has focused on techniques that either discard or correct instances. This paper proposes that discarding and correcting are special cases of instance weighting, and thus part of this framework. We propose a method that uses clustering to calculate a probability distribution over the class labels for each instance. We demonstrate that our method improves classifier accuracy over the original training set. We also demonstrate that instance weighting can outperform discarding.
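A minimal sketch of how such a clustering-based weighting scheme might look. This is an illustrative assumption, not the paper's exact algorithm: the function name, the use of KMeans, and the choice of LogisticRegression are all placeholders. The idea shown is only the core one from the abstract: estimate each instance's class-membership probabilities from the label composition of its cluster, then pass the confidence in the current label as a per-instance weight during training.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def label_confidence_weights(X, y, n_clusters=10, random_state=0):
    """Hypothetical sketch: per-instance confidence in the current label,
    estimated from the label distribution of the instance's cluster."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=random_state).fit_predict(X)
    classes = np.unique(y)
    weights = np.empty(len(y))
    for c in range(n_clusters):
        members = clusters == c
        if not members.any():
            continue
        # label distribution inside this cluster ~ class membership probabilities
        counts = np.array([(y[members] == k).sum() for k in classes])
        probs = counts / counts.sum()
        # each member's weight is the probability of its own (current) label
        for i in np.where(members)[0]:
            weights[i] = probs[classes == y[i]][0]
    return weights

# Usage: weight (rather than discard) suspect instances during training.
X = np.random.RandomState(0).randn(200, 2)
y = (X[:, 0] > 0).astype(int)
w = label_confidence_weights(X, y, n_clusters=4)
clf = LogisticRegression().fit(X, y, sample_weight=w)
```

Discarding corresponds to a weight of 0 and full trust to a weight of 1, which is how the abstract frames both as special cases of instance weighting.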
Similar Resources
Combining Instance Weighting and Fine Tuning for Training Naïve Bayesian Classifiers with Scant data
This work addresses the problem of having to train a Naïve Bayesian classifier using limited data. It first presents an improved instance-weighting algorithm that is accurate and robust to noise and then it shows how to combine it with a fine tuning algorithm to achieve even better classification accuracy. Our empirical work using 49 benchmark data sets shows that the improved instance-weightin...
Modified RWGH and Positive Noise Mitigation Schemes for TOA Geolocation in Indoor Multi-hop Wireless Networks
Time of arrival (TOA) based geolocation schemes for indoor multi-hop environments are investigated and compared to conventional geolocation schemes such as least squares (LS) or residual weighting (RWGH). Multi-hop ranging involves positive multi-hop noise as well as non-line-of-sight (NLOS) and Gaussian measurement noise, so it is more prone to ranging error than one-hop range....
A Co-evolutionary Framework for Nearest Neighbor Enhancement: Combining Instance and Feature Weighting with Instance Selection
The nearest neighbor rule is one of the most representative methods in data mining. In recent years, a great number of proposals have arisen for improving its performance. Among them, instance selection stands out due to its ability to improve the accuracy of the classifier and its efficiency simultaneously, by editing out noise and considerably reducing the size of the training set. It...
EP-based robust weighting scheme for fuzzy SVMs
Support vector machine (SVM) classifiers represent one of the most powerful and promising tools for solving classification problems. In the past decade SVMs have been shown to have excellent performance in the field of data mining. The standard SVM classifier treats all instances equally. However, in many applications we have different levels of confidence in different instances that belong to ...
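The notion of giving instances different confidence levels during SVM training can be sketched with scikit-learn's `SVC`, whose `fit` accepts a per-instance `sample_weight`. The confidence scores below are a hypothetical stand-in; the EP-based scheme from this paper is not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(1)
X = rng.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Hypothetical confidence scores: instances far from the decision
# boundary get weight near 1, borderline ones get a small weight.
margin = np.abs(X[:, 0] + X[:, 1])
confidence = 0.1 + 0.9 * margin / margin.max()

# A standard SVM would treat all instances equally; here each
# instance contributes in proportion to its confidence.
clf = SVC(kernel="linear").fit(X, y, sample_weight=confidence)
```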
Becoming More Robust to Label Noise with Classifier Diversity
It is widely known in the machine learning community that class noise can be (and often is) detrimental to inducing a model of the data. Many current approaches use a single, often biased, measurement to determine if an instance is noisy. A biased measure may work well on certain data sets, but it can also be less effective on a broader set of data sets [1]. In this paper, we present noise iden...
Journal:
Volume, Issue:
Pages: -
Publication date: 2007